Multivariate analysis of green to ultraviolet spectra of cell and tissue samples

ABSTRACT

This invention relates to methods for processing in vivo skin auto-fluorescence spectra for determining blood glucose levels. The invention also relates to methods of classifying cells or tissue samples or quantifying a component of a cell or tissue using a multivariate classification or quantification model.

RELATED APPLICATION

[0001] The present invention claims priority to U.S. Provisional Patent Application No. 60/183,356, filed Feb. 18, 2000, and titled “Multivariate Analysis of Green to Ultraviolet Spectra of Cell and Tissue Samples.”

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] This invention relates to analysis methodology and multivariate classification of diagnostic spectra, and, in particular, to methods for processing in vivo skin auto-fluorescence spectra for determining blood glucose levels. The invention also relates to methods of classifying cells or tissue samples or quantifying a component of a cell or tissue using a multivariate classification or quantification model.

[0004] 2. Description of the Background

[0005] Near-IR spectra taken from agricultural samples, such as grains, oil, seeds and feeds, etc., have been used to quantitate various bulk constituents, e.g., total protein, water content, or fat content. See, P. Williams et al., “Agricultural Applications of Near-IR Spectroscopy and PLS Processing,” Canadian Grain Commission.

[0006] Multivariate statistical methods have long been used in the analysis of biomedical samples by infrared and near infrared, generally under the name “chemometrics.” See, U.S. Pat. No. 5,596,992 to Haaland et al., titled “Multivariate Classification of Infrared Spectra of Cell and Tissue Samples,” and U.S. Pat. No. 5,857,462 to Thomas et al., titled “Systematic Wavelength Selection For Improved Multivariate Spectral Analysis.”

[0007] The use of multivariate methods for the analysis of ex vivo tissue samples is well established. For spectra taken in vivo, some work has been done. Linear discriminant analysis has been used to classify visible/near-IR spectra of human finger joints into early and late rheumatoid arthritis classes. Multivariate methods have been used to classify fluorescence spectra taken in vivo from cervixes according to the presence or absence of cervical cancer or pre-cancerous tissues.

[0008] In general, the field of chemometrics is well established, and the use of multivariate statistical methods for the analysis of complex spectra is common. These methods are used in pharmaceutical analysis, industrial applications, and, more recently, biomedical spectral analysis.

SUMMARY OF THE INVENTION

[0009] Recently, it has been discovered that glucose levels can be determined in vivo by measuring fluorescence spectra emitted from the skin surface following excitation with one or more wavelengths. See, U.S. patent application Ser. No. 09/287,486, titled, “Non-Invasive Tissue Glucose Level Monitoring,” filed Apr. 6, 1999, and incorporated in its entirety herein by reference. Particularly, peak ratios, correlation analysis, and linear regression analysis have been used to analyze skin autofluorescence spectra for the purpose of determining the blood glucose concentration. Partial least squares (“PLS”) analysis of near-IR spectra is the basis of all infrared efforts towards non-invasive glucose monitoring.

[0010] Analysis of collected spectra is complicated by the fact that it can be difficult to distinguish changes or variations in the spectra due to skin variables, such as skin inhomogeneity, UV damage, age, erythema, and the like. The present invention addresses this problem by providing a method of processing in vivo skin auto fluorescence spectra to account for these types of variables.

[0011] Accordingly, one embodiment of the invention is directed to a method for processing in vivo skin auto fluorescence spectra emitted by a skin surface of a patient to determine a blood glucose level of the patient. The method comprises the steps of collecting auto fluorescent spectra emitted from the skin surface of the patient, and correcting the collected spectra using multivariate analysis techniques to account for variables among skin surfaces.

[0012] Another embodiment is directed to an instrument for determining a correct glucose level of a patient by measuring in vivo auto-fluorescence of the patient's skin comprising: means for irradiating the skin with a plurality of excitation wavelengths; means for collecting a plurality of emitted wavelengths; and means for analyzing the collected wavelengths to determine a preliminary blood glucose level. The means for analyzing comprises a means for correcting the preliminary blood glucose level to account for variations in skin using one or more multivariate analytical techniques to determine the correct glucose level of the patient.

[0013] In addition, the present invention also relates to methods of classifying cells or tissue samples, or quantifying their components, using multivariate analysis of the measured intensities of a plurality of wavelengths of emitted radiation.

[0014] Other embodiments and advantages of the invention are set forth in part in the description which follows, and in part, will be obvious from this description, or may be learned from the practice of the invention.

DESCRIPTION OF THE INVENTION

[0015] As embodied and broadly described herein, the present invention is directed to the processing of in vivo skin auto-fluorescence spectra for the purposes of determining blood glucose levels. In-vivo fluorescence spectra have been shown to correlate with blood glucose levels. See, Id. Although large changes in skin fluorescence spectra due to changes in blood glucose levels have been observed, it can sometimes be difficult to separate the variations in the spectra caused by changes in blood glucose from other spectral changes due to factors such as skin inhomogeneity, age effects, UV damage, erythema, etc.

[0016] For large subject populations, it is desirable to be able to determine an algorithm for converting skin fluorescence spectra into glucose values which works on a large percentage of the population as opposed to a single individual. Thus, there is a need for an analysis method that takes into account more spectral information than that which is found at a single wavelength or two.

[0017] By analyzing large numbers of spectra from a wide range of individuals, a useful instrument for the non-invasive monitoring of glucose using fluorescence excitation spectroscopy may be developed which accommodates differences in skin. By using multivariate statistical approaches, a quantitation algorithm useful across many individuals may be created. Many multivariate techniques are useful in this regard. Useful analytical methodologies include, but are not limited to: quantification methodologies, such as, partial least squares, principal component regression (“PCR”), linear regression, multiple linear regression, stepwise linear regression, ridge regression, radial basis functions, and the like; classification methodologies, such as, linear discriminant analysis (“LDA”), cluster analysis (e.g., k-means, C-means, etc., both fuzzy and hard), neural network (“NN”) analysis; and data processing methodologies, such as, 1-D or 2-D smoothing filters (based on median filtering, mean filtering, discrete cosine, wavelet, or Fourier transform), Laplacian operators, maximum likelihood estimators, maximum entropy methods, first and second derivatives (both in 1-D and 2-D implementations), peak enhancement methods (such as Fourier self-deconvolution), principal components analysis as a pre-processing step, and varimax rotations for PLS and PC methods.

[0018] Other methodologies and data processing methods may further include sorting data according to their glucose values, followed by the application of one or more data filtering/smoothing algorithms, within an individual in a small dataset or within each individual for larger, multiple-person datasets. Sorting by glucose or any other relevant analyte has at least two desirable effects: (1) it groups data with similar glucose values together, so that the subsequent application of filtering techniques will reduce “noise” not attributable to glucose, and (2) it reduces temporal correlation inherent in preserving a dataset as a time series, and thereby reduces spurious correlation effects.
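By way of illustration only, the sort-then-filter step described above might be sketched as follows. The sketch assumes that each spectrum is a list of intensities paired with a reference glucose value, and that the smoothing is a median filter applied, wavelength by wavelength, across spectra that are adjacent in glucose order. The function name sort_and_smooth and the window parameter are hypothetical and are not taken from the specification.

```python
from statistics import median


def sort_and_smooth(glucose, spectra, window=3):
    """Sort spectra by reference glucose, then median-filter across neighbours.

    glucose : list of reference glucose values, one per spectrum
    spectra : list of spectra, each a list of intensities of equal length
    window  : odd number of glucose-adjacent spectra per median (assumption)
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")

    # Reordering by glucose discards the original time ordering.
    order = sorted(range(len(glucose)), key=lambda i: glucose[i])
    g_sorted = [glucose[i] for i in order]
    s_sorted = [spectra[i] for i in order]

    half = window // 2
    n = len(s_sorted)
    smoothed = []
    for row in range(n):
        # Neighbouring spectra have similar glucose values; the window is
        # truncated at the ends of the dataset.
        neighbours = s_sorted[max(0, row - half):min(n, row + half + 1)]
        # Median at each wavelength across the neighbouring spectra.
        smoothed.append([median(column) for column in zip(*neighbours)])
    return g_sorted, smoothed
```

For a multiple-person dataset, the same function would be called once per individual, consistent with the per-individual application described above.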

[0019] In addition or alternately, spectral transformation algorithms may be applied to each person's data prior to smoothing or sorting. These transfer functions will enable calibrations made on spectra from one individual to be more easily transferable to spectra from another individual or individuals by minimizing the spectral differences between them. Such algorithms may be as simple as the ratio of the means of the spectra of two individuals, or some complex algorithm which takes into account the responsivity characteristics of each spectrometer.
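The simplest transfer function mentioned above, the ratio of the mean spectra of two individuals, might be sketched as follows. This is a minimal illustration which assumes that both individuals' spectra are sampled at the same wavelengths and that the mean intensity is non-zero at every wavelength. The function names are hypothetical.

```python
def mean_spectrum(spectra):
    """Mean intensity at each wavelength over a list of spectra."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]


def ratio_of_means_transfer(reference_spectra, subject_spectra):
    """Per-wavelength ratio of the reference individual's mean spectrum
    to the subject's mean spectrum (assumes non-zero subject means)."""
    ref_mean = mean_spectrum(reference_spectra)
    sub_mean = mean_spectrum(subject_spectra)
    return [r / s for r, s in zip(ref_mean, sub_mean)]


def apply_transfer(spectrum, transfer):
    """Map one subject spectrum onto the reference individual's scale."""
    return [x * t for x, t in zip(spectrum, transfer)]
```

A calibration built on the reference individual could then be applied to the subject's transformed spectra. A transfer function that accounts for spectrometer responsivity would require instrument data that this sketch does not model.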

[0020] Methods of the invention may also include pre-classification of spectra into categories of glucose levels prior to quantification. This can be done with any of the supervised classification methods listed above, e.g., LDA, PCR, NN, and the like. Sequential binary division of spectra may also be applied, e.g., above/below 150, then, if below 150, above/below 100, if above 150, then above/below 200, etc.
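The sequential binary division in the example above might be organized as sketched below. The sketch assumes that one previously trained binary classifier exists for each of the thresholds 100, 150 and 200, each supplied as a callable that returns True when a spectrum is judged to lie above its threshold. The dictionary is_above and the category labels are hypothetical, and how each classifier is trained (e.g., by LDA or NN) is outside the scope of the sketch.

```python
def classify_sequential(spectrum, is_above):
    """Assign a spectrum to a glucose category by successive binary splits.

    is_above : dict mapping a threshold (100, 150, 200) to a callable that
               takes a spectrum and returns True if it is classified as
               above that threshold (hypothetical, trained elsewhere).
    """
    if is_above[150](spectrum):
        # Upper branch: split again at 200.
        if is_above[200](spectrum):
            return "above 200"
        return "150 to 200"
    # Lower branch: split again at 100.
    if is_above[100](spectrum):
        return "100 to 150"
    return "below 100"
```

Each resulting category could then be passed to its own quantification model, so that each model only has to cover a narrow glucose range.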

[0021] Furthermore, non-linear model fitting techniques can be used to incorporate prior models of absorption and emission spectra of known fluorophores, and subsequently use the parameters of the best fit model as part of the multivariate analysis.

[0022] In addition, methods of the invention may also use wavelength-selection algorithms to reduce the number of spectral data points prior to classification or quantitation. Examples of these methods include genetic algorithm methodologies, step-wise linear regression and comprehensive combinatorial linear discriminant analysis, and the like.

[0023] Accordingly, one embodiment of the invention is directed to a method for processing in vivo skin auto fluorescence spectra emitted by a skin surface of a patient to determine a blood glucose level of the patient comprising the steps of collecting auto fluorescence spectra emitted from the skin surface of the patient and correcting the collected spectra using multivariate analysis methods to account for variables among skin surfaces. The multivariate analysis method may comprise one or more quantification, classification or data processing methods selected from the group consisting of partial least squares, principal component regression, linear regression, multiple linear regression, stepwise linear regression, ridge regression, linear discriminant analysis, cluster analysis (k-means, C-means, etc., both fuzzy and hard), neural network analysis, smoothing filters (based on median filtering, mean filtering, discrete cosine, wavelet and Fourier transform smoothing all in both 1-D and 2-D methods), Laplacian operators, maximum likelihood estimators, maximum entropy methods, first and second derivatives (both in 1-D and 2-D implementations), peak enhancement methods such as Fourier self-deconvolution, principal components analysis as a pre-processing step, and varimax rotations for PLS and PC methods.

[0024] Another embodiment is directed to an instrument for determining a correct glucose level of a patient by measuring in vivo auto-fluorescence of the patient's skin comprising: means for irradiating the skin with a plurality of excitation wavelengths; means for collecting a plurality of emitted wavelengths; and means for analyzing the collected wavelengths to determine a preliminary blood glucose level. The means for analyzing comprises means for correcting the preliminary blood glucose level to account for variations in skin. The means for correcting comprises using one or more multivariate analytical methodologies to determine the correct glucose level of the patient.

[0025] Quantification Models

[0026] The present invention is useful for quantifying components in a cell or tissue, and may be used, for example, to quantify tissue glucose levels in vivo. Accordingly, one embodiment of the invention is directed to a method of quantifying a component of a cell or tissue sample comprising the steps of: generating a single excitation wavelength or plurality of different excitation wavelengths of green to ultraviolet light; irradiating the sample with the light and measuring the intensity of the stimulated emission of the sample at a minimum of three different wavelengths of lower energy than the excitation light or at a plurality of wavelengths of lower energy than the excitation light; and quantifying one or more components of the cell or tissue from the measured intensities by using a multivariate quantification model. The green to ultraviolet light may be in the green to violet range of wavelengths, or alternately, it may be in the violet to near-ultraviolet range of wavelengths.

[0027] The component quantified may be glucose or another desired component. Irradiating may be done in vivo or in vitro.

[0028] In a preferred embodiment, the step of quantifying the component of the sample includes at least one spectral data pre-processing step. In one such embodiment, the pre-processing step includes at least one of the steps of selecting wavelengths, correcting for a linear baseline, and normalizing a spectral region surrounding the different wavelengths, used for classification of one spectral band in that spectral region. Alternately, the pre-processing step includes at least one of the steps of normalizing for total area of the spectrum, filtering or smoothing the data, or pre-sorting by analyte.
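Two of the pre-processing steps named above, correcting for a linear baseline and normalizing for total area, might be sketched as follows. The sketch assumes that the baseline is the straight line through the first and last points of the selected spectral region and that area is computed by the trapezoidal rule. Both choices are illustrative assumptions, as are the function names.

```python
def correct_linear_baseline(wavelengths, intensities):
    """Subtract the straight line joining the first and last points
    of the region (assumed form of the linear baseline)."""
    x0, x1 = wavelengths[0], wavelengths[-1]
    y0, y1 = intensities[0], intensities[-1]
    slope = (y1 - y0) / (x1 - x0)
    return [y - (y0 + slope * (x - x0))
            for x, y in zip(wavelengths, intensities)]


def total_area(wavelengths, intensities):
    """Area under the spectrum by the trapezoidal rule."""
    return sum(
        0.5 * (intensities[i] + intensities[i + 1])
        * (wavelengths[i + 1] - wavelengths[i])
        for i in range(len(wavelengths) - 1)
    )


def normalize_total_area(wavelengths, intensities):
    """Scale the spectrum so that its total area equals one."""
    area = total_area(wavelengths, intensities)
    if area == 0:
        raise ValueError("spectrum has zero area and cannot be normalized")
    return [y / area for y in intensities]
```

Area normalization removes overall intensity differences between measurements while preserving spectral shape, which is one way of reducing variation that is unrelated to the analyte.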

[0029] Multivariate quantification may be done by a partial least squares technique, by a principal component regression technique, or by one of multiple linear regression, stepwise linear regression or ridge regression.
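As one illustration of the techniques listed above, ridge regression might be sketched as follows using the NumPy library. The sketch assumes a calibration matrix X holding one spectrum per row and a vector y of reference analyte values, and solves the standard closed-form ridge equations on mean-centered data. The function names and the penalty parameter alpha are hypothetical; partial least squares and principal component regression are not shown.

```python
import numpy as np


def fit_ridge(X, y, alpha=1.0):
    """Fit ridge regression: minimize ||yc - Xc b||^2 + alpha * ||b||^2.

    X     : (n_samples, n_wavelengths) calibration spectra
    y     : (n_samples,) reference analyte values
    alpha : penalty strength (illustrative assumption)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Center so the intercept is not penalized.
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    # Closed-form solution of the penalized normal equations.
    gram = Xc.T @ Xc + alpha * np.eye(X.shape[1])
    coef = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def predict(X, coef, intercept):
    """Predict analyte values for new spectra."""
    return np.asarray(X, dtype=float) @ coef + intercept
```

The penalty shrinks the coefficients, which tends to stabilize the model when intensities at neighbouring wavelengths are highly correlated, as is typical of fluorescence spectra.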

[0030] In a preferred embodiment of this method, the step of quantifying the component of the sample is performed by a multivariate algorithm using the measured intensity information and at least one multivariate quantification model which is a function of conventionally determined cell or tissue component quantities from a set of reference samples and a set of spectral intensities as a function of wavelength obtained from irradiating the set of reference samples with green to ultraviolet light and monitoring the stimulated emission.

[0031] Another embodiment of the invention is directed to a method of quantifying a component of a cell or tissue sample comprising: generating a single excitation wavelength or plurality of different excitation wavelengths of mid-ultraviolet light; irradiating the sample with said light and measuring the intensity of the stimulated emission of the sample at a minimum of three different wavelengths of lower energy than the excitation light or at a plurality of wavelengths of lower energy than the excitation light; generating at least one multivariate quantification model, said model quantifying the different components of the sample as a function of the intensity characteristics at the measured wavelengths in relation to a reference quantitation result; calculating the quantity of the component from the measured intensities by using multivariate quantitation of the intensities at the at least three different wavelengths based on the quantitation model; and quantifying the component from the measured intensities by using said multivariate quantification model.

[0032] As with the previous embodiment, the sample component may be quantified in vitro or in vivo. Components which may be analyzed include glucose.

[0033] Preferably, the step of quantifying the component of the samples includes at least one spectral data pre-processing step. The pre-processing step preferably includes at least one of the steps of selecting wavelengths, correcting for a linear baseline, and normalizing a spectral region surrounding the different wavelengths, used for classification of one spectral band in that spectral region. Alternately, the pre-processing step includes at least one of the steps of normalizing for total area of the spectrum, filtering or smoothing the data, or pre-sorting the data by analyte. Multivariate quantification may be done by a partial least squares technique, by a principal component regression technique, or by one of multiple linear regression, stepwise linear regression or ridge regression.

[0034] In a preferred embodiment, the step of quantifying the component of the sample is performed by a multivariate algorithm using the measured intensity information and at least one multivariate quantification model which is a function of conventionally determined cell or tissue component quantities from a set of reference samples and a set of spectral intensities as a function of wavelength obtained from irradiating the set of reference samples with green to ultraviolet light and monitoring the stimulated emission.

[0035] The present invention is also directed to a system for quantifying one or more components of a cell or tissue sample comprising: means for generating a single excitation wavelength or a plurality of different excitation wavelengths of green to ultraviolet light; means for directing at least a portion of the green to ultraviolet light into the sample; means for collecting at least a portion of the stimulated emission light after the excitation light has interacted with the sample; means for measuring an intensity of the collected stimulated emission light at at least three different wavelengths; means, coupled to the measuring means, for storing the measured intensities as a function of the wavelength; means for storing at least one multivariate quantification model which contains data indicative of a correct quantification of components of known cell or tissue samples; and processor means coupled to the means for storing the measured intensities and the means for storing the model, the processor means serving as means for calculating the quantity of the components of the cell or tissue sample by use of the multivariate quantification model and the measured intensities.

[0036] In one embodiment of the system, the means to direct the light and the means to collect the light comprise an endoscope. Alternately, the means to direct the light and the means to collect the light may comprise a fiber optic bundle. The system may further include means to determine outliers.

[0037] Classification Models

[0038] The present invention may also be used to classify cells or tissue samples. For example, one such embodiment is directed to a method of classifying a cell or tissue sample comprising the steps of: generating a single excitation wavelength or plurality of different excitation wavelengths of green to ultraviolet light; irradiating the sample with said light and measuring the intensity of the stimulated emission of the sample at a minimum of three different wavelengths of lower energy than the excitation light or at a plurality of wavelengths of lower energy than the excitation light; and classifying the sample as one of two or more cell or tissue types from the measured intensities by using a multivariate classification model.

[0039] The green to ultraviolet light may be in the green to violet range of wavelengths, or alternately, in the violet to near-ultraviolet range of wavelengths.

[0040] The sample may be classified as normal or abnormal. Irradiating may be done in vivo or in vitro.

[0041] Preferably, the step of classifying the samples includes at least one spectral data pre-processing step. For example, the pre-processing step may include at least one of the steps of selecting wavelengths, correcting for a linear baseline, and normalizing a spectral region surrounding the different wavelengths, used for classification of one spectral band in that spectral region. Alternately, the pre-processing step may include at least one of the steps of normalizing for total area of the spectrum, filtering or smoothing the data, or pre-sorting the data by analyte.

[0042] Multivariate classification may be done by a linear discriminant analysis technique. Preferably, the linear discriminant analysis is preceded by a principal component analyzing step limiting the number of discriminant variables.
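A principal component step followed by linear discriminant analysis might be sketched as follows using the NumPy library. The sketch assumes exactly two classes (e.g., normal and abnormal), computes principal components by singular value decomposition of the mean-centered spectra, applies Fisher's two-class discriminant to the retained scores, and places the decision threshold midway between the projected class means, which implicitly assumes equal class priors. All function names, dictionary keys and the n_components parameter are hypothetical.

```python
import numpy as np


def fit_pca_lda(X, labels, n_components=2):
    """Fit PCA followed by two-class Fisher LDA on the PCA scores.

    X            : (n_samples, n_wavelengths) reference spectra
    labels       : (n_samples,) class labels, exactly two distinct values
    n_components : number of principal components retained (assumption)
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")

    # PCA: rows of vt are the principal axes of the centered data.
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    components = vt[:n_components]
    scores = (X - mean) @ components.T

    # Fisher LDA on the reduced set of discriminant variables.
    s0 = scores[labels == classes[0]]
    s1 = scores[labels == classes[1]]
    m0, m1 = s0.mean(axis=0), s1.mean(axis=0)
    within = (s0 - m0).T @ (s0 - m0) + (s1 - m1).T @ (s1 - m1)
    w = np.linalg.solve(within, m1 - m0)
    threshold = w @ (m0 + m1) / 2.0  # midpoint, equal priors assumed
    return {"mean": mean, "components": components, "w": w,
            "threshold": threshold, "classes": classes}


def predict_pca_lda(model, X):
    """Classify new spectra with a model returned by fit_pca_lda."""
    X = np.asarray(X, dtype=float)
    scores = (X - model["mean"]) @ model["components"].T
    above = scores @ model["w"] > model["threshold"]
    return np.where(above, model["classes"][1], model["classes"][0])
```

Retaining only a few principal components keeps the within-class scatter matrix well conditioned when there are many more wavelengths than reference samples, which is the practical reason for limiting the number of discriminant variables.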

[0043] In a preferred embodiment of the method, the step of classifying the sample is performed by a multivariate algorithm using the measured intensity information and at least one multivariate classification model which is a function of conventionally determined cell or tissue sample classes from a set of reference samples and a set of spectral intensities as a function of wavelength obtained from irradiating the set of reference samples with green to ultraviolet light and monitoring the stimulated emission.

[0044] Another embodiment of the invention is directed to a method of classifying a cell or tissue sample comprising: generating a single excitation wavelength or plurality of different excitation wavelengths of mid-ultraviolet light; irradiating the sample with said light and measuring the intensity of the stimulated emission of the sample at a minimum of three different wavelengths of lower energy than the excitation light or at a plurality of wavelengths of lower energy than the excitation light; generating at least one multivariate classification model, said model classifying the sample as a function of the intensity characteristics at the measured wavelengths in relation to a reference classification; calculating the classification of the sample from the measured intensities by using multivariate classification of the intensities at the at least three different wavelengths based on the classification model; and classifying the sample as one of two or more cell or tissue types from the measured intensities by using said multivariate classification model.

[0045] Classifying may be done in vitro or in vivo. The sample may be classified as normal or abnormal. Preferably, the step of classifying the samples includes at least one spectral data pre-processing step. For example, the pre-processing step may include at least one of the steps of selecting wavelengths, correcting for a linear baseline, and normalizing a spectral region surrounding the different wavelengths, used for classification of one spectral band in that spectral region. Alternately, the pre-processing step may include at least one of the steps of normalizing for total area of the spectrum, filtering or smoothing the data, or pre-sorting the data by analyte.

[0046] In one embodiment, multivariate classification is done by a linear discriminant analysis technique. In this embodiment, the linear discriminant analysis is preferably preceded by a principal component analyzing step limiting the number of discriminant variables.

[0047] In a preferred embodiment of this method, the step of classifying the sample is performed by a multivariate algorithm using the measured intensity information and at least one multivariate classification model which is a function of conventionally determined cell or tissue sample classes from a set of reference samples and a set of spectral intensities as a function of wavelength obtained from irradiating the set of reference samples with green to ultraviolet light and monitoring the stimulated emission.

[0048] Another embodiment is directed to a system for classifying cell or tissue samples comprising: means for generating a single excitation wavelength or a plurality of different excitation wavelengths of green to ultraviolet light; means for directing at least a portion of the green to ultraviolet light into the samples; means for collecting at least a portion of the stimulated emission light after the excitation light has interacted with the samples; means for measuring an intensity of the collected stimulated emission light at at least three different wavelengths; means, coupled to the measuring means, for storing the measured intensities as a function of the wavelength; means for storing at least one multivariate classification model which contains data indicative of a correct classification of known cell or tissue samples; and processor means coupled to the means for storing the measured intensities and the means for storing the model, the processor means serving as means for calculating the classification of the cell or tissue samples as one of two or more cells or tissues by use of the multivariate classification model and the measured intensities.

[0049] In one embodiment, the means to direct the light and the means to collect the light comprise an endoscope. Alternately, the means to direct the light and the means to collect the light comprise a fiber optic bundle. The system may further include means to determine outliers.

[0050] In the above embodiments, the means for generating excitation radiation may be any type of excitation source, preferably, xenon arc lamps (plus appropriate filters and/or monochromators); a plurality of laser diodes or LEDs; mercury lamps; halogen lamps; tungsten filament lamps; or any combination thereof. Further, appropriate filters and/or monochromators can be added.

[0051] In addition to using a fiber optic bundle or endoscope, suitable means for directing or collecting radiation may comprise any of the following: liquid light guides; system of optical components (mirrors, lenses, etc.); individual fiber optic cables; plastic optical components; quartz optical components; or any combination thereof.

[0052] In the above embodiments, suitable means for measuring an intensity of the radiation may be selected from the group consisting of photodiodes; photodiode arrays; avalanche photodiodes; LEDs; laser diodes; charge-coupled device (CCD) detectors (arrays or individually); silicon detectors; or any combination thereof. Suitable storing means may be computers (hardware and software); EPROMs; programmed firmware; and the like. Further, suitable processing means may be any type of existing digital processing devices.

[0053] Other embodiments and uses of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. All references cited herein, including all U.S. and foreign patents and patent applications, are specifically and entirely hereby incorporated herein by reference, including, but not limited to, U.S. patent application Ser. No. 09/287,486, titled “Non-Invasive Tissue Glucose Level Monitoring,” filed Apr. 6, 1999. U.S. patent application titled “Reduction of Inter-Subject Variation Via Transfer Standardization,” U.S. patent application titled “Generation of Spatially-Averaged Excitation-Emission Map in Heterogeneous Tissue,” and U.S. patent application titled “Non-Invasive Tissue Glucose Level Monitoring,” all filed contemporaneously herewith, are entirely and specifically incorporated by reference. It is intended that the specification and examples be considered exemplary only, with the true scope and spirit of the invention indicated by the following claims.

1. A method for processing in vivo skin auto fluorescence spectra emitted by a skin surface of a patient to determine a blood glucose level of the patient comprising the steps of: collecting auto fluorescence spectra emitted from the skin surface of the patient; and correcting the collected spectra using multivariate analysis to account for skin surface variables.
2. The method of claim 1 wherein the multivariate analysis comprises one or more quantification, classification, or data processing techniques selected from the group consisting of: partial least squares, principal component regression, linear regression, multiple linear regression, stepwise linear regression, ridge regression, radial basis functions, linear discriminant analysis, cluster analysis, neural network analysis, smoothing filters, Laplacian operators, maximum likelihood estimators, maximum entropy, first and second derivatives, peak enhancement, Fourier self-deconvolution, principal components, and varimax rotations.
3. An instrument for determining a correct glucose level of a patient by measuring in vivo autofluorescence of the patient's skin comprising: means for irradiating the skin with a plurality of excitation wavelengths; means for collecting a plurality of emitted wavelengths; and means for analyzing the collected wavelengths to determine a preliminary blood glucose level, said means for analyzing comprising means for correcting the preliminary blood glucose level to account for variations in skin, said means for correcting comprising using one or more multivariate analytical methodologies to determine the correct glucose level of the patient.